Simple Approach


A Simple Approach to Automated Spectral Clustering

Neural Information Processing Systems

The performance of spectral clustering relies heavily on the quality of the affinity matrix. A variety of affinity-matrix-construction (AMC) methods have been proposed, but they have hyperparameters that must be determined beforehand, which requires strong experience and leads to difficulty in real applications, especially when the inter-cluster similarity is high and/or the dataset is large. In addition, we often need to choose different AMC methods for different datasets, which again depends on experience. To solve these two challenging problems, in this paper we present a simple yet effective method for automated spectral clustering. First, we propose to find the most reliable affinity matrix via grid search or Bayesian optimization among a set of candidates given by different AMC methods with different hyperparameters, where reliability is quantified by the relative eigen-gap of the graph Laplacian introduced in this paper. Second, we propose a fast and accurate AMC method based on least squares representation and thresholding, and prove its effectiveness theoretically. Finally, we provide a large-scale extension of the automated spectral clustering method, whose time complexity is linear in the number of data points. Extensive experiments on natural image clustering show that our method is more versatile, accurate, and efficient than baseline methods.
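
As a rough illustration of the selection step described above, the following Python sketch scores candidate affinity matrices by an eigen-gap criterion and keeps the best one. The exact relative-eigen-gap formula is defined in the paper; the scale-free ratio used here, and the Gaussian-kernel candidates in the usage example, are assumptions for illustration only.

    import numpy as np
    from scipy.sparse.csgraph import laplacian

    def relative_eigen_gap(A, k):
        # Normalized graph Laplacian of the symmetric affinity matrix A.
        L = laplacian(A, normed=True)
        lam = np.sort(np.linalg.eigvalsh(L))  # eigenvalues, ascending
        # A large gap after the k smallest eigenvalues suggests k clean
        # clusters; normalizing by lam[k] is our assumption, not the paper's.
        return (lam[k] - lam[k - 1]) / max(lam[k], 1e-12)

    def select_affinity(candidates, k):
        # Grid search: keep the candidate with the largest score.
        return max(candidates, key=lambda A: relative_eigen_gap(A, k))

    # Usage: candidates from Gaussian kernels with different bandwidths.
    X = np.random.randn(60, 5)
    d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    candidates = [np.exp(-d2 / (2 * s ** 2)) for s in (0.5, 1.0, 2.0)]
    A_best = select_affinity(candidates, k=3)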


Attracting and Dispersing: A Simple Approach for Source-free Domain Adaptation

Neural Information Processing Systems

We propose a simple but effective source-free domain adaptation (SFDA) method. Treating SFDA as an unsupervised clustering problem and following the intuition that local neighbors in feature space should have more similar predictions than other features, we propose to optimize an objective of prediction consistency. This objective encourages local neighborhood features in feature space to have similar predictions, while features farther away in feature space have dissimilar predictions, leading to efficient feature clustering and cluster assignment simultaneously. For efficient training, we seek to optimize an upper bound of the objective, resulting in two simple terms. Furthermore, we relate popular existing methods in domain adaptation, source-free domain adaptation, and contrastive learning from the perspective of discriminability and diversity. The experimental results demonstrate the superiority of our method, and our method can be adopted as a simple but strong baseline for future research in SFDA. Our method can also be adapted to source-free open-set and partial-set DA, which further shows its generalization ability.
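
The attract/disperse objective described above can be sketched in a few lines of PyTorch. This is a hedged illustration, not the paper's implementation: the nearest-neighbor retrieval from a memory bank, the equal weighting of the two terms, and the function signature are all assumptions.

    import torch

    def attract_disperse_loss(feats, probs, bank_feats, bank_probs, k=5):
        # feats: (B, D) L2-normalized target features; probs: (B, C) softmax
        # outputs. bank_*: a memory bank filled from earlier batches.
        sim = feats @ bank_feats.t()          # (B, N) cosine similarities
        nn_idx = sim.topk(k, dim=1).indices   # k nearest neighbors per sample
        nn_probs = bank_probs[nn_idx]         # (B, k, C) neighbor predictions
        # Attract: encourage predictions similar to those of nearby features.
        attract = -(probs.unsqueeze(1) * nn_probs).sum(-1).mean()
        # Disperse: push predictions of other points in the batch apart.
        dots = probs @ probs.t()
        off_diag = ~torch.eye(len(probs), dtype=torch.bool, device=probs.device)
        disperse = dots[off_diag].mean()
        return attract + disperse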


A Simple Approach to Automated Spectral Clustering: Appendices

Fan, Jicong, Tu, Yiheng

Neural Information Processing Systems

The time complexity is O(kτn). KLSR with a Gaussian kernel utilizes local information to enhance C. The algorithm of AutoSC-GD with only LSR and KLSR is shown in Algorithm 1.


A Surprisingly Simple Approach to Generalized Few-Shot Semantic Segmentation

Neural Information Processing Systems

The goal of generalized few-shot semantic segmentation (GFSS) is to recognize novel-class objects through training with a few annotated examples and a base-class model that has learned the base classes. Unlike classic few-shot semantic segmentation, GFSS aims to classify pixels into both base and novel classes, making it a more practical setting. Current GFSS methods rely on techniques such as combinations of customized modules, carefully designed loss functions, meta-learning, and transductive learning. However, we found that a simple rule and standard supervised learning substantially improve GFSS performance. In this paper, we propose a simple yet effective method for GFSS that does not use the techniques mentioned above. We also theoretically show that our method perfectly maintains the segmentation performance of the base-class model over most of the base classes. Through numerical experiments, we demonstrate the effectiveness of our method: it improves novel-class segmentation performance in the 1-shot scenario by 6.1% on the PASCAL-5i dataset, 4.7% on the PASCAL-10i dataset, and 1.0% on the COCO-20i dataset. Our code is publicly available at https://github.com/IBM/BCM.


L3Cube-IndicSBERT: A simple approach for learning cross-lingual sentence representations using multilingual BERT

Deode, Samruddhi, Gadre, Janhavi, Kajale, Aditi, Joshi, Ananya, Joshi, Raviraj

arXiv.org Artificial Intelligence

The multilingual Sentence-BERT (SBERT) models map different languages to a common representation space and are useful for cross-lingual similarity and mining tasks. We propose a simple yet effective approach to convert vanilla multilingual BERT models into multilingual sentence-BERT models using a synthetic corpus. We simply aggregate translated NLI or STS datasets of the low-resource target languages and perform SBERT-like fine-tuning of the vanilla multilingual BERT model. We show that multilingual BERT models are inherently cross-lingual learners and that this simple baseline fine-tuning approach, without explicit cross-lingual training, yields exceptional cross-lingual properties. We show the efficacy of our approach on 10 major Indic languages and also show its applicability to the non-Indic languages German and French. Using this approach, we further present L3Cube-IndicSBERT, the first multilingual sentence representation model specifically for the Indian languages Hindi, Marathi, Kannada, Telugu, Malayalam, Tamil, Gujarati, Odia, Bengali, and Punjabi. IndicSBERT exhibits strong cross-lingual capabilities and performs significantly better than alternatives like LaBSE, LASER, and paraphrase-multilingual-mpnet-base-v2 on Indic cross-lingual and monolingual sentence similarity tasks. We also release monolingual SBERT models for each of the languages and show that IndicSBERT performs competitively with its monolingual counterparts. These models have been evaluated using embedding similarity scores and classification accuracy.
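
A minimal sketch of the described SBERT-like fine-tuning, using the sentence-transformers library: start from vanilla mBERT with mean pooling and train on pooled translated sentence pairs. The one-line dataset below is a placeholder, and the training settings are assumptions, not the paper's recipe.

    from torch.utils.data import DataLoader
    from sentence_transformers import (SentenceTransformer, models,
                                       losses, InputExample)

    word = models.Transformer("bert-base-multilingual-cased")   # vanilla mBERT
    pool = models.Pooling(word.get_word_embedding_dimension())  # mean pooling
    model = SentenceTransformer(modules=[word, pool])

    # Placeholder for the aggregated corpus: translated (premise, entailed
    # hypothesis) pairs across the low-resource target languages.
    pairs = [("A man is eating.", "एक आदमी खा रहा है।")]
    train = [InputExample(texts=[a, b]) for a, b in pairs]
    loader = DataLoader(train, shuffle=True, batch_size=32)
    loss = losses.MultipleNegativesRankingLoss(model)
    model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=100)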


A simple approach for quantizing neural networks

Maly, Johannes, Saab, Rayan

arXiv.org Artificial Intelligence

In this short note, we propose a new method for quantizing the weights of a fully trained neural network. A simple deterministic pre-processing step allows us to quantize network layers via memoryless scalar quantization while preserving the network's performance on given training data. On the one hand, the computational complexity of this pre-processing slightly exceeds that of state-of-the-art algorithms in the literature. On the other hand, our approach does not require any hyperparameter tuning and, in contrast to previous methods, admits a straightforward analysis. We provide rigorous theoretical guarantees in the case of quantizing single network layers and show that the relative error decays with the number of parameters in the network if the training data behaves well, e.g., if it is sampled from suitable random distributions. The developed method also readily allows the quantization of deep networks by consecutive application to single layers.
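
For intuition, here is a toy illustration of the memoryless scalar quantization stage: each weight is rounded independently to a uniform grid. The paper's deterministic pre-processing step, which is what preserves network performance, is not reproduced here.

    import numpy as np

    def scalar_quantize(W, bits=4):
        # Round each weight independently (memoryless) to a symmetric
        # uniform grid with 2**bits - 1 levels.
        levels = 2 ** bits - 1
        scale = np.abs(W).max() / (levels // 2) + 1e-12
        q = np.clip(np.round(W / scale), -(levels // 2), levels // 2)
        return q * scale

    W = np.random.randn(256, 128) / 16
    Wq = scalar_quantize(W)
    print("relative error:", np.linalg.norm(W - Wq) / np.linalg.norm(W))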


Applications of Autoencoders part2 (Machine Learning)

#artificialintelligence

Abstract: We discuss a simple approach to transform autoencoders into "pattern filters". Besides filtering, we show how this simple approach can also be used to build robust classifiers, by learning to filter only patterns of a given class.

Abstract: The problem of permutation-invariant learning over set representations is particularly relevant in the field of multi-agent systems -- a few potential applications include unsupervised training of aggregation functions in graph neural networks (GNNs), neural cellular automata on graphs, and prediction of scenes with multiple objects. Yet existing approaches to set encoding and decoding tasks present a host of issues, including non-permutation-invariance, fixed-length outputs, reliance on iterative methods, non-deterministic outputs, computationally expensive loss functions, and poor reconstruction accuracy. In this paper we introduce a Permutation-Invariant Set Autoencoder (PISA), which tackles these problems and produces encodings with significantly lower reconstruction error than existing baselines.
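
A hedged sketch of the pattern-filter idea from the first abstract: train one autoencoder per class, then classify a sample by which class's autoencoder reconstructs it best. The architecture and the mean-squared reconstruction error are illustrative assumptions.

    import torch
    import torch.nn as nn

    class AE(nn.Module):
        def __init__(self, d=784, h=64):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(d, h), nn.ReLU())
            self.dec = nn.Linear(h, d)
        def forward(self, x):
            return self.dec(self.enc(x))

    def classify(x, autoencoders):
        # autoencoders: dict mapping class label -> AE trained only on
        # that class's patterns; lowest reconstruction error wins.
        with torch.no_grad():
            errs = {c: torch.mean((ae(x) - x) ** 2).item()
                    for c, ae in autoencoders.items()}
        return min(errs, key=errs.get)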


GausSetExpander: A Simple Approach for Entity Set Expansion

Diallo, Aïssatou, Fürnkranz, Johannes

arXiv.org Artificial Intelligence

Entity Set Expansion (ESE) is an important task in Natural Language Processing that aims at expanding a small set of entities into a larger one with items from a large pool of candidates. The problem implicitly requires a definition of similarity between the given entities and the candidates. In this paper, we propose GausSetExpander, an unsupervised approach to ESE based on optimal transport techniques. We propose to re-frame the problem as choosing the entity that best completes the input set. For this, we interpret a set as an elliptical distribution with a centroid representing the mean and a dispersion parameter capturing the spread. The best candidate entity is the one that increases the spread of the set the least. We analyze the strengths and weaknesses of the proposed solution in order to assess the validity of our approach.
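
The selection rule described above admits a short sketch: score each candidate by how much appending it inflates the set's dispersion, and pick the one that inflates it least. Using the trace of the empirical covariance as the spread is an assumption for illustration; the paper works with elliptical distributions and optimal transport.

    import numpy as np

    def spread(embs):
        # Total variance of the set's embeddings as a scalar dispersion.
        return np.trace(np.cov(embs, rowvar=False))

    def best_candidate(seed_set, candidates):
        # seed_set: (n, d) entity embeddings; candidates: list of (d,) vectors.
        def increase(c):
            return spread(np.vstack([seed_set, c])) - spread(seed_set)
        return min(candidates, key=increase)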


A Simple Baseline for Beam Search Reranking

Vassertail, Lior, Levy, Omer

arXiv.org Artificial Intelligence

Reranking methods in machine translation aim to close the gap between common evaluation metrics (e.g. BLEU) and maximum likelihood learning and decoding algorithms. Prior works address this challenge by training models to rerank beam search candidates according to their predicted BLEU scores, building upon large models pretrained on massive monolingual corpora -- a privilege that was never made available to the baseline translation model. In this work, we examine a simple approach for training rerankers to predict translation candidates' BLEU scores without introducing additional data or parameters. Our approach can be used as a clean baseline, decoupled from external factors, for future research in this area.
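
The reranking step the abstract describes reduces to an argmax over beam candidates under a learned scorer. In the sketch below, reranker is a placeholder for a model trained to predict sentence-level BLEU; its signature is an assumption.

    from typing import Callable, List

    def rerank(source: str, candidates: List[str],
               reranker: Callable[[str, str], float]) -> str:
        # Return the beam candidate with the highest predicted BLEU score.
        return max(candidates, key=lambda hyp: reranker(source, hyp))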